
Variational Bayes


Early-stopped aggregation: Adaptive inference with computational efficiency

Ohn, Ilsang, Fan, Shitao, Jun, Jungbin, Lin, Lizhen

arXiv.org Machine Learning

When considering model selection or, more generally, an aggregation approach for adaptive statistical inference, the lack of prior knowledge often makes it necessary to compute estimators over a wide range of model complexities, including unnecessarily large models, even when the true data-generating process is relatively simple. This requirement can lead to substantial computational inefficiency. In this work, we propose a novel framework for efficient model aggregation called early-stopped aggregation (ESA): instead of computing and aggregating estimators for all candidate models, we compute only a small number of simpler ones using an early-stopping criterion and aggregate only these for final inference. Our framework is versatile and applies to both Bayesian model selection, in particular within the variational Bayes framework, and frequentist estimation, including a general penalized estimation setting. We investigate the adaptive optimality of the ESA approach across three learning paradigms. We first show that ESA achieves optimal adaptive contraction rates in the variational Bayes setting under mild conditions. We extend this result to variational empirical Bayes, where prior hyperparameters are chosen in a data-dependent manner. In addition, we apply the ESA approach to frequentist aggregation, including both penalization-based and sample-splitting implementations, and establish the corresponding theory. As we demonstrate, there is a clear unification between early-stopped Bayes and frequentist penalized aggregation, with a common "energy" functional, comprising a data-fitting term and a complexity-control term, that drives both procedures. We further present several applications and numerical studies that highlight the efficiency and strong performance of the proposed approach.
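The abstract's core loop — compute estimators from simplest to most complex, stop early once an "energy" criterion (data fit plus complexity penalty) stops improving, and aggregate only the estimators actually computed — can be sketched as follows. This is an illustrative reading, not the authors' algorithm; the `patience` parameter and the exponential-weighting rule are our assumptions.

```python
import numpy as np

def early_stopped_aggregation(estimators, energies, patience=1):
    """Illustrative ESA loop (not the paper's exact procedure).

    estimators: estimates ordered from simplest to most complex model
                (in practice these would be computed lazily inside the loop).
    energies:   corresponding "energy" values, data-fit term plus
                complexity penalty; lower is better.
    Stops once the energy fails to improve for `patience` steps, then
    aggregates the computed estimators with exponential weights.
    """
    best, stalls = np.inf, 0
    kept, kept_energy = [], []
    for est, e in zip(estimators, energies):
        kept.append(est)
        kept_energy.append(e)
        if e < best:
            best, stalls = e, 0
        else:
            stalls += 1
            if stalls >= patience:
                break  # early stop: larger models are never computed
    w = np.exp(-(np.array(kept_energy) - best))  # exponential weights
    w /= w.sum()
    agg = sum(wi * np.asarray(ki) for wi, ki in zip(w, kept))
    return agg, len(kept)
```

With energies `[5, 3, 4, 2, 1]` and `patience=1`, the loop stops after the third model, so the two best-fitting (but more complex) candidates are never touched — the computational saving the abstract describes.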




Neural Information Processing Systems

Furthermore, we show that the WAE objective is related to other statistical quantities such as the f-divergence and, in particular, is upper bounded by the Wasserstein distance, which then allows us to tap into existing efficient (regularized) optimal transport solvers.



Variational Bayes under Model Misspecification

Neural Information Processing Systems

Variational Bayes (VB) is a scalable alternative to Markov chain Monte Carlo (MCMC) for Bayesian posterior inference. Though popular, VB comes with few theoretical guarantees, most of which focus on well-specified models. However, models are rarely well-specified in practice. In this work, we study VB under model misspecification. We prove the VB posterior is asymptotically normal and centers at the value that minimizes the Kullback-Leibler (KL) divergence to the true data-generating distribution. Moreover, the VB posterior mean centers at the same value and is also asymptotically normal.
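A toy illustration of the centering result, under assumptions the abstract does not make (all names and numbers are ours): data drawn from a Laplace distribution but modeled as Gaussian with a conjugate normal prior on the mean. The exact posterior is then Gaussian, so a Gaussian variational family recovers it exactly, and its mean concentrates near the KL minimizer — the Gaussian closest in KL to the true Laplace law, whose mean is the true location.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
# Misspecified setting: data are Laplace(2, 1), model says N(theta, 1).
x = rng.laplace(loc=2.0, scale=1.0, size=n)

tau2 = 100.0                        # N(0, tau2) prior on theta
post_var = 1.0 / (n + 1.0 / tau2)   # conjugate normal-normal update
post_mean = post_var * x.sum()
# post_mean sits near 2, the mean of the KL-closest Gaussian to Laplace(2, 1),
# and post_var shrinks at the usual 1/n rate, matching the asymptotic normality
# described in the abstract.
```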


Robust Variational Bayes by Min-Max Median Aggregation

Yan, Jiawei, Liu, Ju, Liu, Weidong, Tu, Jiyuan

arXiv.org Machine Learning

We propose a robust and scalable variational Bayes (VB) framework designed to handle contamination and outliers in datasets effectively. Our approach partitions the data into $m$ disjoint subsets and formulates a joint optimization problem based on robust aggregation principles. A key insight is that the full posterior distribution is the minimizer of the mean Kullback-Leibler (KL) divergence from the $m$-powered local posterior distributions. To enhance robustness, we replace the mean KL divergence with a min-max median formulation. The min-max formulation not only ensures consistency between the KL minimizer and the Evidence Lower Bound (ELBO) maximizer but also facilitates the establishment of improved statistical rates for the mean of the variational posterior. We observe a notable discrepancy in the $m$-powered marginal log-likelihood function contingent on the presence of local latent variables. To address this, we treat the two scenarios separately to guarantee the consistency of the aggregated variational posterior. Specifically, when local latent variables are present, we introduce an aggregate-and-rescale strategy. Theoretically, we provide a non-asymptotic analysis of the proposed posterior, incorporating a refined analysis of the Bernstein-von Mises (BvM) theorem to accommodate a diverging number of subsets $m$. Our findings indicate that the two-stage approach yields a smaller approximation error than directly aggregating the $m$-powered local posteriors. Furthermore, we establish a nearly optimal statistical rate for the mean of the proposed posterior, advancing existing theory on min-max median estimators. The efficacy of our method is demonstrated through extensive simulation studies.
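The robustness principle at work — aggregate local results with a median rather than a mean — can be conveyed with a deliberately simplified scalar sketch. This is a plain median of local estimates, not the paper's min-max median over KL divergences, and all names are ours:

```python
import numpy as np

rng = np.random.default_rng(1)
m = 10
# m disjoint subsets of clean N(0, 1) data...
subsets = [rng.normal(0.0, 1.0, size=100) for _ in range(m)]
subsets[0] += 50.0  # ...except one subset is badly contaminated

local_means = np.array([s.mean() for s in subsets])
naive = local_means.mean()       # dragged far from 0 by one bad subset
robust = np.median(local_means)  # median aggregation ignores the outlier
```

One contaminated subset shifts the naive average by roughly 50/m, while the median stays near the truth — the same failure mode, at the level of posterior distributions, that motivates the paper's min-max median formulation.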


Variational bagging: a robust approach for Bayesian uncertainty quantification

Fan, Shitao, Ohn, Ilsang, Dunson, David, Lin, Lizhen

arXiv.org Machine Learning

Variational Bayes methods are popular due to their computational efficiency and adaptability to diverse applications. In specifying the variational family, mean-field classes are commonly used; they enable efficient algorithms such as coordinate ascent variational inference (CAVI) but fail to capture parameter dependence and typically underestimate uncertainty. In this work, we introduce a variational bagging approach that integrates a bagging procedure with variational Bayes, resulting in a bagged variational posterior for improved inference. We establish strong theoretical guarantees, including posterior contraction rates for general models and a Bernstein-von Mises (BvM) type theorem that ensures valid uncertainty quantification. Notably, our results show that even when using a mean-field variational family, our approach can recover off-diagonal elements of the limiting covariance structure and provide proper uncertainty quantification. In addition, variational bagging is robust to model misspecification, with the resulting covariance structure matching that of the target. We illustrate our variational bagging method in numerical studies through applications to parametric models, finite mixture models, deep neural networks, and variational autoencoders (VAEs).
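A minimal sketch of the bagging idea, under simplifying assumptions of ours (known-covariance Gaussian model, with the mean-field "posterior" taken to be a diagonal Gaussian centered at each bootstrap sample mean): pooling draws across bootstrap replicates reintroduces the off-diagonal covariance that any single mean-field fit would miss.

```python
import numpy as np

rng = np.random.default_rng(0)
n, B = 200, 100
cov = np.array([[1.0, 0.9], [0.9, 1.0]])  # strongly correlated components
x = rng.multivariate_normal([0.0, 0.0], cov, size=n)

pooled = []
for _ in range(B):
    xb = x[rng.integers(0, n, size=n)]        # bootstrap resample
    mu = xb.mean(axis=0)
    sd = xb.std(axis=0, ddof=1) / np.sqrt(n)  # mean-field: diagonal scales only
    pooled.append(rng.normal(mu, sd, size=(50, 2)))

pooled = np.concatenate(pooled)
bagged_cov = np.cov(pooled.T)
# Each individual fit has zero off-diagonal covariance, but the bootstrap
# variation of the centers mu restores a positive (1, 2) entry.
bagged_corr = bagged_cov[0, 1] / np.sqrt(bagged_cov[0, 0] * bagged_cov[1, 1])
```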


Variational Bayes on Monte Carlo Steroids

Neural Information Processing Systems

Variational approaches are often used to approximate intractable posteriors or normalization constants in hierarchical latent variable models. While often effective in practice, it is known that the approximation error can be arbitrarily large. We propose a new class of bounds on the marginal log-likelihood of directed latent variable models. Our approach relies on random projections to simplify the posterior. In contrast to standard variational methods, our bounds are guaranteed to be tight with high probability. We provide a new approach for learning latent variable models based on optimizing our new bounds on the log-likelihood. We demonstrate empirical improvements on benchmark datasets in vision and language for sigmoid belief networks, where a neural network is used to approximate the posterior.



Coupled Variational Bayes via Optimization Embedding

Neural Information Processing Systems

Variational inference plays a vital role in learning graphical models, especially on large-scale datasets. Much of its success depends on a proper choice of auxiliary distribution class for posterior approximation. However, how to pursue an auxiliary distribution class that achieves both good approximation ability and computational efficiency remains a core challenge. In this paper, we propose coupled variational Bayes, which exploits the primal-dual view of the ELBO with a variational distribution class generated by an optimization procedure termed optimization embedding.
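Since the ELBO is central to several of the abstracts above, a self-contained Monte Carlo illustration may help (toy model and variable names are ours): for the conjugate model z ~ N(0, 1), x | z ~ N(z, 1), the ELBO equals the log evidence exactly when q is the true posterior N(x/2, 1/2), and is strictly smaller for any other q.

```python
import numpy as np

def log_normal(z, mu, sigma):
    """Log density of N(mu, sigma^2) at z."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((z - mu) / sigma) ** 2

rng = np.random.default_rng(0)
x = 1.0  # a single observed data point

def elbo(m, s, n=100_000):
    """Monte Carlo ELBO: E_q[log p(z, x) - log q(z)] with q = N(m, s^2)."""
    z = rng.normal(m, s, size=n)
    log_joint = log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0)
    return np.mean(log_joint - log_normal(z, m, s))

log_evidence = log_normal(x, 0.0, np.sqrt(2.0))  # exact: x ~ N(0, 2)
tight = elbo(0.5, np.sqrt(0.5))  # q = exact posterior: ELBO equals evidence
loose = elbo(0.0, 1.0)           # q = prior: ELBO strictly smaller
```

When q is the exact posterior, the integrand log p(z, x) - log q(z) is constant in z (it equals log p(x) pointwise), so the Monte Carlo estimate is exact up to floating-point error; the gap for any other q is the KL divergence from q to the posterior.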